The Impact of Rushing vs. Passing Offense in the NFL
In the National Football League (NFL), an offense has two main ways to move the ball and score: rushing and passing. Each method has its own strengths and weaknesses, so teams plan carefully how to divide their offensive plays between the two. Some teams plan to pass most of the time, while others try to rush on the majority of plays. This led me to wonder which style of offense leads to more wins. Is a ruthless passing attack the best way to win, or is a violent ground game the path to victory?
How do a team’s rushing statistics per game affect its win-loss ratio?
How do a team’s passing statistics per game affect its win-loss ratio?
How should teams run their offense to achieve their best possible win-loss ratio?
To answer my research questions, I pulled three data sets from two Kaggle pages. The first and second data sets can be found here. They contain the rushing and passing statistics for all teams from 2002 through 2017. I merged those two data sets to combine their unique variables, then merged the result with the third data set, which can be found here. This third data set gives the number of wins each team had, but it covers a wider range of seasons, from 1998 to 2019. To keep the data consistent, I filtered it so that it only includes the 16 seasons (2002 through 2017) that the other two data sets cover.
Rows: 512
Columns: 16
$ YEAR <dbl> 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002,…
$ Team <chr> "ARZ", "ATL", "BAL", "BUF", "CAR", "CHI", "CIN", "CLE", "DAL", "D…
$ W <dbl> 5, 9, 7, 8, 7, 4, 2, 9, 5, 9, 3, 12, 4, 10, 6, 8, 9, 6, 9, 9, 10,…
$ WL <dbl> 0.31, 0.56, 0.44, 0.50, 0.44, 0.25, 0.12, 0.56, 0.31, 0.56, 0.19,…
$ GP <dbl> 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 1…
$ PA <dbl> 548, 479, 479, 612, 464, 543, 591, 552, 471, 554, 577, 580, 447, …
$ PAG <dbl> 34.25, 29.94, 29.94, 38.25, 29.00, 33.94, 36.94, 34.50, 29.44, 34…
$ PC <dbl> 291, 268, 262, 377, 255, 310, 350, 338, 252, 359, 277, 361, 235, …
$ PY <dbl> 2740, 3167, 2847, 3995, 2694, 3051, 3476, 3412, 2621, 3824, 2994,…
$ PYG <dbl> 171.25, 197.94, 177.94, 249.69, 168.38, 190.69, 217.25, 213.25, 1…
$ PTD <dbl> 1.12, 1.12, 1.25, 1.50, 0.94, 1.38, 1.06, 1.69, 0.88, 1.31, 1.19,…
$ RA <dbl> 414, 523, 427, 388, 452, 382, 426, 406, 423, 457, 358, 451, 424, …
$ RAG <dbl> 25.88, 32.69, 26.69, 24.25, 28.25, 23.88, 26.62, 25.38, 26.44, 28…
$ RY <dbl> 1823, 2368, 1792, 1596, 1586, 1344, 1730, 1615, 1754, 2266, 1477,…
$ RYG <dbl> 113.94, 148.00, 112.00, 99.75, 99.12, 84.00, 108.12, 100.94, 109.…
$ RTD <dbl> 0.62, 1.44, 0.56, 1.06, 0.69, 0.50, 0.81, 0.62, 0.44, 1.31, 0.56,…
The Impact of Rush Attempts on Winning
The Impact of Rush Yards on Winning
The Impact of Rushing Touchdowns on Winning
Based on the three scatter plots, all three statistics (rushing attempts per game, rushing yards per game, and rushing touchdowns per game) show a positive relationship with win-loss ratio. In other words, teams with higher values of these variables tend to win more of their games. However, the relationships are not all equally strong. Based on the best-fit lines from the three graphs, rushing touchdowns have the steepest slope against win-loss ratio. This makes sense: the more touchdowns a team scores, the more likely it is to win the game.
The Impact of Pass Attempts on Winning
The Impact of Pass Yards on Winning
The Impact of Passing Touchdowns on Winning
Unlike the rushing variables, the passing variables do not all have a positive relationship with win-loss ratio. There is a negative relationship between pass attempts per game and win-loss ratio: teams that throw the ball more per game tend to win less often. This is the only negative relationship in the analysis, and it offers some insight into which style of offense is more effective. The other two variables, passing yards per game and passing touchdowns per game, both have positive relationships with win-loss ratio. As with rushing touchdowns, the best-fit line for passing touchdowns has the steepest slope of the three passing variables.
This graph shows the difference between passing and rushing touchdowns. While both are important and affect games, passing touchdowns are more numerous. This reflects the shift toward teams wanting to throw the ball more. When teams throw more, the potential for passing touchdowns increases, but according to the data so does the chance of losing, possibly because of turnovers and wasted plays.
Based on the data collected and the relationships between the variables and teams’ win-loss ratios, the best way for teams to win games is to score touchdowns. Both passing and rushing touchdowns had steep slopes when plotted against win-loss ratio. Beyond this obvious answer, teams win most often when they limit their passing attempts but maximize their yards. Because pass attempts per game is the only variable with a negative relationship to win-loss ratio, teams should try to limit how often they pass. At the same time, because passing yards have a positive relationship with win-loss ratio, teams should try to maximize their yards per attempt. Based on this, teams should try to mix their offensive attack evenly between rushing and passing. Since both styles of offense show positive relationships between yards and win-loss ratio, gaining more yards either way is associated with more success. As long as teams balance their attack and avoid throwing the ball too much, they can achieve higher win-loss ratios.
Although I was happy overall with the data I collected, I wish I had a more up-to-date data set. The data only goes up to 2017, and a lot has changed since then. In 2017, Tom Brady won the Super Bowl with one of the most famous comebacks in NFL history. Since then he has won two more, lost one, and retired twice. The NFL’s rules have also changed in that time: instead of playing 16 games, teams now play 17. I believe this may have affected teams’ ability to keep their offensive production consistent, since the longer season increases the potential for injuries and the loss of key players. Another limitation is that the data covers the entire 16-game season. Teams tend to sit their best players in the final week to prevent injuries. I therefore wish there were a data set covering only the first 15 games of each season, since the final week probably introduces some discrepancies into the data I collected.
---
title: "Analysis of Offense in the NFL"
author: "Sam Limbert"
output:
  flexdashboard::flex_dashboard:
    theme:
      version: 4
      bootswatch: default
      navbar-bg: "Green"
    orientation: columns
    vertical_layout: fill
    source_code: embed
---
Introduction
===
```{r setup, include=FALSE}
library(tidyverse)  # attaches dplyr, tidyr, readr, and ggplot2

# Read the three source files
rushing <- read_csv("rushing.csv")
passing <- read_csv("passing.csv")
football <- read_csv("football.csv")

# Split team_code after its third character into Team and YEAR,
# then keep only the seasons covered by the rushing and passing data
football <- separate(football, team_code, into = c("Team", "YEAR"), sep = 3)
football <- football[football$YEAR >= 2002 & football$YEAR <= 2017, ]

# Keep the columns needed from the wins data and give them short names
football <- select(football, "wins", "YEAR", "completions", "pass attempts", "pass yards", "pass td", "rush yards", "rush td")
football <- football %>%
  rename(PC = completions,
         PA = "pass attempts",
         PY = "pass yards",
         PTD = "pass td",
         RY = "rush yards",
         RTD = "rush td")

# Trim and rename the rushing and passing tables the same way
rushing <- select(rushing, Team, YEAR, GP, Att, TD)
rushing <- rushing %>%
  rename(RA = Att,
         RTD = TD)
passing <- select(passing, Team, YEAR, GP, Att, Comp, TD)
passing <- passing %>%
  rename(PA = Att,
         PC = Comp,
         PTD = TD)

# Merge rushing with passing by team and season, then attach the wins data.
# The wins table no longer has a Team column at this point, so the second
# merge matches rows on season plus three season totals (RTD, PTD, PC).
combined_yds <- merge(rushing, passing, by = c("GP", "Team", "YEAR"), all = FALSE)
data <- merge(combined_yds, football, by = c("RTD", "PTD", "PC", "YEAR"), all = FALSE)

# PA appears in both tables; keep one copy
data <- data %>%
  select(-PA.y)
data <- rename(data, PA = PA.x, W = wins)

# Win-loss ratio and per-game averages (16-game seasons)
data <- mutate(data, WL = W / GP, PYG = PY / 16, RYG = RY / 16, RAG = RA / 16, PAG = PA / 16, RTD = RTD / 16, PTD = PTD / 16)
data <- arrange(data, YEAR, Team)
data <- select(data, YEAR, Team, W, WL, GP, PA, PAG, PC, PY, PYG, PTD, RA, RAG, RY, RYG, RTD)

# Round the derived columns to two decimals
data$PYG <- round(data$PYG, digits = 2)
data$RYG <- round(data$RYG, digits = 2)
data$WL <- round(data$WL, digits = 2)
data$RAG <- round(data$RAG, digits = 2)
data$PAG <- round(data$PAG, digits = 2)
data$RTD <- round(data$RTD, digits = 2)
data$PTD <- round(data$PTD, digits = 2)
```
column {data-width=450}
---
### Introduction
<font size = 5>**The Impact of Rushing vs. Passing Offense in the NFL**</font>
In the National Football League (NFL), an offense has two main ways to move the ball and score: rushing and passing. Each method has its own strengths and weaknesses, so teams plan carefully how to divide their offensive plays between the two. Some teams plan to pass most of the time, while others try to rush on the majority of plays. This led me to wonder which style of offense leads to more wins. Is a ruthless passing attack the best way to win, or is a violent ground game the path to victory?
### Research Questions
- How do a team's rushing statistics per game affect its win-loss ratio?
- How do a team's passing statistics per game affect its win-loss ratio?
- How should teams run their offense to achieve their best possible win-loss ratio?
Column {.tabset data-width=350}
---
### Method
To answer my research questions, I pulled three data sets from two Kaggle pages. The first and second data sets can be found [here](https://www.kaggle.com/datasets/farmander/nfl-statistics?select=NFL+Team+Season+Stats+-+Passing.csv). They contain the rushing and passing statistics for all teams from 2002 through 2017. I merged those two data sets to combine their unique variables, then merged the result with the third data set, which can be found [here](https://www.kaggle.com/datasets/ttalbitt/american-football-team-stats-1998-2019?resource=download). This third data set gives the number of wins each team had, but it covers a wider range of seasons, from 1998 to 2019. To keep the data consistent, I filtered it so that it only includes the 16 seasons (2002 through 2017) that the other two data sets cover.
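The filter-then-merge idea described above can be sketched on a few rows. This is only an illustration under stated assumptions: the table names (`rushing_toy`, `passing_toy`, `wins_toy`) are hypothetical, the 2002 values are taken from the first rows of the glimpse output, and the 1999 row is made up to show the season filter. The sketch joins on `Team` and `YEAR`, which assumes all three tables share the same team codes; the dashboard's own setup chunk matches the wins data differently.

```r
# Hypothetical toy tables; 2002 values come from the glimpse output,
# the 1999 row is invented to demonstrate the season filter
rushing_toy <- data.frame(Team = c("ATL", "BAL", "BUF"), YEAR = 2002,
                          RA = c(523, 427, 388))
passing_toy <- data.frame(Team = c("ATL", "BAL", "BUF"), YEAR = 2002,
                          PA = c(479, 479, 612))
wins_toy <- data.frame(Team = c("ATL", "BAL", "BUF", "ATL"),
                       YEAR = c(2002, 2002, 2002, 1999),
                       W = c(9, 7, 8, 5))

# Keep only the seasons that the rushing and passing tables cover
wins_toy <- wins_toy[wins_toy$YEAR >= 2002 & wins_toy$YEAR <= 2017, ]

# Inner joins on team and season: first rushing with passing, then with wins
offense_toy <- merge(rushing_toy, passing_toy, by = c("Team", "YEAR"))
merged_toy <- merge(offense_toy, wins_toy, by = c("Team", "YEAR"))

# Sanity check: exactly one row per team-season should survive
stopifnot(nrow(merged_toy) == nrow(unique(merged_toy[, c("Team", "YEAR")])))
print(merged_toy)
```

Checking the row count after each merge is a cheap way to catch duplicated or dropped team-seasons before any statistics are computed.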
### Variable Description
- Team: Team
- YEAR: Season year
- W: Wins per season
- WL: Win-loss ratio (wins divided by games played)
- GP: Games played per season
- PA: Pass attempts per season
- PAG: Average pass attempts per game
- PC: Pass completions per season
- PY: Pass yards per season
- PYG: Average pass yards per game
- PTD: Average passing touchdowns per game
- RA: Rushing attempts per season
- RAG: Average rushing attempts per game
- RY: Rushing yards per season
- RYG: Average rushing yards per game
- RTD: Average rushing touchdowns per game
### Glimpse of the Data
```{r}
glimpse(data)
```
Data
===
### Data Table
```{r}
library(DT)
datatable(data, options = list(
columnDefs = list(list()),
pageLength = 9,
lengthMenu = c(5, 10, 15, 20)
))
```
Rushing Effect on WL
===
column {.tabset data-width=600}
---
### RAG
<font size = 5>**The Impact of Rush Attempts on Winning**</font>
```{r scatterplot1, fig.height=3.6}
ggplot(data, aes(x = RAG, y = WL)) +
geom_point() +
geom_smooth(method="lm",se = FALSE, colour="blue") +
theme(axis.text = element_text(size = 11)) +
labs(x = "Average Rushing Attempts per Game", y = "Win-Loss Ratio")
```
### RYG
<font size = 5>**The Impact of Rush Yards on Winning**</font>
```{r scatterplot2, fig.height=3.6}
ggplot(data, aes(x = RYG, y = WL)) +
geom_point() +
geom_smooth(method="lm", se = FALSE, colour="blue") +
theme(axis.text = element_text(size = 11)) +
labs(x = "Average Rushing Yards per Game", y = "Win-Loss Ratio")
```
### RTD
<font size = 5>**The Impact of Rushing Touchdowns on Winning**</font>
```{r scatterplot3, fig.height=3.6}
ggplot(data, aes(x = RTD, y = WL)) +
geom_point() +
geom_smooth(method="lm",se = FALSE, colour="blue") +
theme(axis.text = element_text(size = 11)) +
labs(x = "Average Rushing Touchdowns per Game", y = "Win-Loss Ratio")
```
column { data-width=400}
---
### Analysis
Based on the three scatter plots, all three statistics (rushing attempts per game, rushing yards per game, and rushing touchdowns per game) show a positive relationship with win-loss ratio. In other words, teams with higher values of these variables tend to win more of their games. However, the relationships are not all equally strong. Based on the best-fit lines from the three graphs, rushing touchdowns have the steepest slope against win-loss ratio. This makes sense: the more touchdowns a team scores, the more likely it is to win the game.
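One caveat when comparing best-fit slopes is that the three rushing variables are measured in different units (attempts, yards, touchdowns), so the raw slopes are not directly comparable. A minimal sketch of one way to put them on a common scale is shown here, using only the first eight rows from the glimpse output as a hypothetical `toy` data frame. Eight rows are far too few to draw conclusions; the same calls could be run on the full `data` object instead.

```r
# First eight 2002 rows from the glimpse output (illustration only)
toy <- data.frame(
  WL  = c(0.31, 0.56, 0.44, 0.50, 0.44, 0.25, 0.12, 0.56),
  RAG = c(25.88, 32.69, 26.69, 24.25, 28.25, 23.88, 26.62, 25.38),
  RYG = c(113.94, 148.00, 112.00, 99.75, 99.12, 84.00, 108.12, 100.94),
  RTD = c(0.62, 1.44, 0.56, 1.06, 0.69, 0.50, 0.81, 0.62)
)
predictors <- c("RAG", "RYG", "RTD")

# Raw slope: change in WL per one unit of the predictor (units differ)
raw_slopes <- sapply(predictors, function(v) coef(lm(toy$WL ~ toy[[v]]))[[2]])

# Standardized slope: change in WL, in standard deviations, per one standard
# deviation of the predictor; in simple regression this equals Pearson's r
std_slopes <- sapply(predictors, function(v) cor(toy$WL, toy[[v]]))

print(round(cbind(raw_slopes, std_slopes), 3))
```

The standardized column is the fairer basis for saying which variable is most strongly related to winning, since a touchdown and a yard are very different amounts of offense.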
Passing Effect on WL
===
column {.tabset data-width=600}
---
### PAG
<font size = 5>**The Impact of Pass Attempts on Winning**</font>
```{r scatterplot4, fig.height=3.6}
ggplot(data, aes(x = PAG, y = WL)) +
geom_point() +
geom_smooth(method="lm",se = FALSE, colour="darkgreen") +
theme(axis.text = element_text(size = 11)) +
labs(x = "Average Passing Attempts per Game", y = "Win-Loss Ratio")
```
### PYG
<font size = 5>**The Impact of Pass Yards on Winning**</font>
```{r scatterplot5, fig.height=3.6}
ggplot(data, aes(x = PYG, y = WL)) +
geom_point() +
geom_smooth(method="lm", se = FALSE, colour="darkgreen") +
theme(axis.text = element_text(size = 11)) +
labs(x = "Average Passing Yards per Game", y = "Win-Loss Ratio")
```
### PTD
<font size = 5>**The Impact of Passing Touchdowns on Winning**</font>
```{r scatterplot6, fig.height=3.6}
ggplot(data, aes(x = PTD, y = WL)) +
geom_point() +
geom_smooth(method="lm",se = FALSE, colour="darkgreen") +
theme(axis.text = element_text(size = 11)) +
labs(x = "Average Passing Touchdowns per Game", y = "Win-Loss Ratio")
```
column { data-width=400}
---
### Analysis
Unlike the rushing variables, the passing variables do not all have a positive relationship with win-loss ratio. There is a negative relationship between pass attempts per game and win-loss ratio: teams that throw the ball more per game tend to win less often. This is the only negative relationship in the analysis, and it offers some insight into which style of offense is more effective. The other two variables, passing yards per game and passing touchdowns per game, both have positive relationships with win-loss ratio. As with rushing touchdowns, the best-fit line for passing touchdowns has the steepest slope of the three passing variables.
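To judge how firmly the sign of a slope is supported, it can help to look at a confidence interval rather than the fitted line alone. The sketch below fits the pass-attempts regression on a hypothetical `toy` data frame built from the first eight rows of the glimpse output. It is only meant to show the calls, since eight rows say very little; running it on the full `data` object would be the meaningful version.

```r
# First eight 2002 rows from the glimpse output (illustration only)
toy <- data.frame(
  WL  = c(0.31, 0.56, 0.44, 0.50, 0.44, 0.25, 0.12, 0.56),
  PAG = c(34.25, 29.94, 29.94, 38.25, 29.00, 33.94, 36.94, 34.50)
)

# Simple linear regression of win-loss ratio on pass attempts per game
fit <- lm(WL ~ PAG, data = toy)

# Estimated slope and its 95% confidence interval
slope <- coef(fit)[["PAG"]]
ci <- confint(fit, "PAG", level = 0.95)

cat("Slope of WL on PAG:", round(slope, 4), "\n")
print(round(ci, 4))
```

If the interval includes zero, the data are consistent with no linear relationship at all. Even when it does not, the slope describes an association across team-seasons and does not by itself show that passing more causes losses.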
Touchdowns
===
column{ data-width=600}
---
```{r, fig.align='center', fig.width=10, fig.height=6.5}
library(ggplot2)
combined_plot <- ggplot(data) +
geom_point(aes(x = RTD, y = WL, color = "Rushing"), size = 3) +
geom_point(aes(x = PTD, y = WL, color = "Passing"), size = 3) +
scale_color_manual(name = "Touchdown Type",
values = c("Rushing" = "blue", "Passing" = "red")) +
labs(title = "Touchdowns vs WL",
x = "Touchdowns",
y = "WL")
combined_plot
```
column{ data-width=400}
---
### Analysis
This graph shows the difference between passing and rushing touchdowns. While both are important and affect games, passing touchdowns are more numerous. This reflects the shift toward teams wanting to throw the ball more. When teams throw more, the potential for passing touchdowns increases, but according to the data so does the chance of losing, possibly because of turnovers and wasted plays.
Conclusion
===
column {data-width=600}
---
### Conclusion
Based on the data collected and the relationships between the variables and teams' win-loss ratios, the best way for teams to win games is to score touchdowns. Both passing and rushing touchdowns had steep slopes when plotted against win-loss ratio. Beyond this obvious answer, teams win most often when they limit their passing attempts but maximize their yards. Because pass attempts per game is the only variable with a negative relationship to win-loss ratio, teams should try to limit how often they pass. At the same time, because passing yards have a positive relationship with win-loss ratio, teams should try to maximize their yards per attempt. Based on this, teams should try to mix their offensive attack evenly between rushing and passing. Since both styles of offense show positive relationships between yards and win-loss ratio, gaining more yards either way is associated with more success. As long as teams balance their attack and avoid throwing the ball too much, they can achieve higher win-loss ratios.
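Yards per attempt and the pass/rush balance are not columns in the data set, but they follow directly from the season totals. As a rough sketch, the first four 2002 rows from the glimpse output are used below; the column names `PYA`, `RYA`, and `pass_share` are hypothetical and were not part of the original analysis.

```r
# First four 2002 rows from the glimpse output (illustration only)
toy <- data.frame(
  Team = c("ARZ", "ATL", "BAL", "BUF"),
  PA = c(548, 479, 479, 612),      # pass attempts per season
  PY = c(2740, 3167, 2847, 3995),  # pass yards per season
  RA = c(414, 523, 427, 388),      # rush attempts per season
  RY = c(1823, 2368, 1792, 1596)   # rush yards per season
)

# Efficiency: yards gained per attempt for each style of offense
toy$PYA <- round(toy$PY / toy$PA, 2)
toy$RYA <- round(toy$RY / toy$RA, 2)

# Balance: share of all attempts that were passes (0.5 = evenly mixed)
toy$pass_share <- round(toy$PA / (toy$PA + toy$RA), 2)

print(toy)
```

Plotting `PYA` or `pass_share` against win-loss ratio on the full data would be one way to check the efficiency and balance recommendations more directly than attempts and yards can on their own.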
### Limitations
Although I was happy overall with the data I collected, I wish I had a more up-to-date data set. The data only goes up to 2017, and a lot has changed since then. In 2017, Tom Brady won the Super Bowl with one of the most famous comebacks in NFL history. Since then he has won two more, lost one, and retired twice. The NFL's rules have also changed in that time: instead of playing 16 games, teams now play 17. I believe this may have affected teams' ability to keep their offensive production consistent, since the longer season increases the potential for injuries and the loss of key players. Another limitation is that the data covers the entire 16-game season. Teams tend to sit their best players in the final week to prevent injuries. I therefore wish there were a data set covering only the first 15 games of each season, since the final week probably introduces some discrepancies into the data I collected.
column { data-width=500}
---
### About the Author
My name is Sam Limbert and I am a sophomore at the University of Dayton pursuing a Bachelor of Science in Applied Mathematics in Economics with a minor in Data Analytics. I am projected to graduate in May 2026.
Feel free to connect with me on [LinkedIn](https://www.linkedin.com/in/sam-limbert-1405ba2a8/).
```{r}
htmltools::img(src = knitr::include_graphics("picsy.png"), width = 350)
```